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The analysis of physical measurements often copes with highly correlated noises and interruptions 
caused by outliers, saturation events or transmission losses. We assess the impact of missing data 
on the performance of linear regression analysis involving the fit of modeled or measured time series. 

We show that data gaps can significantly alter the precision of the regression parameter estimation 
in the presence of colored noise, due to the frequency leakage of the noise power. We present a 
regression method which cancels this effect and estimates the parameters of interest with a precision 
comparable to the complete data case, even if the noise power spectral density (PSD) is not known 
a priori. The method is based on an autoregressive (AR) fit of the noise, which allows us to build 
an approximate generalized least squares estimator approaching the minimal variance bound. The 
method, which can be applied to any similar data processing, is tested on simulated measurements 
of the MICROSCOPE space mission, whose goal is to test the Weak Equivalence Principle (WEP) 
with a precision of 10 -15 . In this particular context the signal of interest is the WEP violation 
signal expected to be found around a well defined frequency. We test our method with different gap 
patterns and noise of known PSD and find that the results agree with the mission requirements, 
decreasing the uncertainty by a factor 60 with respect to ordinary least squares methods. We show 
that it also provides a test of significance to assess the uncertainty of the measurement. 

PACS numbers: 04.80.Cc, 04.80.Nn,07.87.+v,95.55.-n,07.05.Kf 

Keywords: Data processing, Experimental test of gravitational theories, Spaceborne instruments 


I. INTRODUCTION 

Situations where series of measurements, ideally reg¬ 
ularly sampled, suffer from short interruptions are com¬ 
mon in a wide range of applications and experimental 
set-ups. It is also usual to perform linear regression anal¬ 
ysis of data samples, in order to estimate parameters of 
interest by fitting other data series to the measured sig¬ 
nals. In particular, this is a typical scenario for space 
missions in fundamental physics such as MICROSCOPE 
m and LISA Pathfinder [5]. Long time integrations are 
needed by these experiments to reach the required signal- 
to-noise ratios (SNR) or the required levels of free-fall at 
the frequencies of interest. The duration of such mea¬ 
surements increases the probability to have invalid data 
in the integration period. It has been found that gaps 
could arise in the time series measured by the accelerom¬ 
eters carried on-board the MICROSCOPE satellite, and 
that those gaps could have substantial impact on the out¬ 
come of the regression when data is noisy. 

Here “gaps” refers to either lack of data or unusable 
information such as saturations and outliers during short 
or long time spans, which are eventually discarded. In the 
case of the MICROSCOPE space mission, discontinuities 
in the data availability could be due to data losses in the 
telemetry transmission, while data alteration could be 
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the consequence of three main identified causes: crack¬ 
les in the cold gas tanks triggered by decreasing pres¬ 
sure as they empty, crackles in the multi-layer insulation 
(MLI) coating due to temperature variations in flight, or 
micro-meteorites impacts. All saturated data are clearly 
identified by a flagging system in the telemetry. 

The objective of the MICROSCOPE signal processing 
can be regarded as rather general. It consists in detecting 
and estimating the amplitude of a periodic signal present 
in some measured time series. In the studied case the sig¬ 
nal is the signature of a possible violation of the Weak 
Equivalence Principle (WEP), as detailed later, and is ex¬ 
pected to arise around a certain frequency that we denote 
/ep- The amplitude to be estimated is the “EP param¬ 
eter”, denoted S. In previous works |4j the data analysis 
had been optimized in order to minimize the projection of 
possible unknown harmonic perturbations onto the sig¬ 
nal of interest by an appropriate tuning of its frequency 
/ep and/or the integration duration, in particular in the 
case of missing data. At the time, instrumental noise 
had been disregarded in order to exclusively deal with 
projection effects. Here we rather focus on the impact of 
missing data on the noise affecting the estimation. 

While the proposed approach is applied to MICRO¬ 
SCOPE simulated data, it leads to provide a robust 
method to estimate one or several deterministic compo¬ 
nents in the general context of time series with missing 
data affected by unknown colored noise. Although we 
have physical models of the expected noise spectrum, we 
assume in this study that it is not known a priori , allow- 
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ing us to cope with the most general situation. 

We show that noise distortions due to missing data 
points may dramatically increase the uncertainty of the 
estimation. This is due to the convolution effect between 
the observation window and the original noise spectrum, 
which leads to a leakage of the frequencies where the 
power is high to the frequencies where the power is low. 

Methods such as Ordinary Least Squares (OLS) or 
equivalently Lomb-Scargle periodogram M, as well as 
CLEAN-like algorithms [71, may fail in retrieving the re¬ 
quired precision 011, mainly because these approaches 
rely on a white noise assumption. In order to increase the 
precision of the fit, the noise correlation matrix must also 
be estimated. A general approach is to maximize the like¬ 
lihood function with respect to both regression parame¬ 
ters (6 in our problem) and the noise correlation matrix. 
Such an approach can use the Expectation-Minimization 
(EM) procedure like MAPES algorithms [TD] . However, 
their convergence may be very slow, especially for large 
data samples like in the MICROSCOPE case (about 10 6 
points). More recent works also use least squares iter¬ 
ative adaptive approaches (IAA) to estimate harmonic 
and noise parameters iteratively m , but require to store 
and invert correlation matrices, which is computationally 
expensive with an observation vector of 10 6 entries. Like¬ 
wise, the authors of the last two techniques do not present 
applications with colored noise. Some methods are al¬ 
ready implemented to extract unknown colored spectral 
densities, especially in the domain of gravitational waves 
detection (see for example but they do not 

tackle the problem of gapped time series. A suitable 
method is thus developed to estimate the EP parameter 
in case of missing data. 

Another type of algorithms referred to as “inpainting” 
techniques is based on a sparsity-prior to fill the gaps 
[23 US]. Their adaptation to general noise spectra is 
currently studied in the MICROSCOPE team (Berge et 
al ., in prep.). We rather focus here on an approach that 
avoids filling the gaps. 

We develop a method with two successive objectives. 
The first one is to reach the order of magnitude of the 
original (i.e. complete data) uncertainty in the estima¬ 
tion of the amplitudes of the deterministic components 
we are looking for. The second objective is to theoret¬ 
ically quantify the improvement on the variance of the 
estimator, using an approach that does not require to fill 
in the data gaps. 

Our technique is based on the estimation of the noise 
spectrum by using a liigh-order autoregressive (AR) 
model. The result is used to weight the data through an 
orthogonalization of the covariance matrix. This leads 
to an approximation of the best estimator in the sense of 
the variance, also referred to as the Best Linear Unbiased 
Estimator (BLUE) which is also the Generalized Least 
Squares (GLS) estimator in a linear regression context. 
The main idea in the proposed approach is to separately 
estimate the noise coefficients and the regression parame¬ 
ters instead of jointly estimating all the parameters. This 


is done in an iterative procedure that avoids the use of 
non-linear optimization algorithms. 

The proposed approach, that we dub “Kalman-AR 
Model Analysis” or “KARMA” for short, is divided in 
three steps. The first step consists in estimating the AR 
parameters describing the noise. This is done by using 
Burg’s algorithm adapted to discontinuous data [TT]. The 
second step is carried out via a Kalman filter algorithm 
based on the AR model that allows us to compute the 
weights, as shown by Jones [H]. In the third step we fi¬ 
nally compute an approximation of the Generalized Least 
Squares (GLS) estimator of the regression parameters, in 
a way similar to maximum likelihood computation meth¬ 
ods applied to regression models pranuj. These steps 
can be reproduced to converge to the maximum likeli¬ 
hood estimator (MLE) of the parameters. 

In this paper, we first analyze the effect of the missing 
data pattern on the estimation uncertainty (section |Tl|. 
We then describe the KARMA method (section [TTT] ) and 
we present a way of evaluating its performance, allow¬ 
ing us to give a criteria for the detection of the searched 
signal (section IV). Finally, after a brief description of 
the mission context, we apply this technique to MICRO¬ 
SCOPE simulated time series, in particular to data sam¬ 
ples generated with the mission and instrument simulator 
(section |v|). In section VI we discuss the results. 


II. IMPACT OF MISSING DATA 

Although we apply our study to the MICROSCOPE 
data analysis, it can be viewed as a general regression 
problem. The measurement equation can be summarized 
as follows: 


7 = Jsep + / ] a-iSp 


(1) 


where 7 is the V-points complete measurement vector 
defined as 7 = (70 ... 7 at-i) , and 6 and «ep are re¬ 
spectively the parameter and the signal of interest (the 
EP parameter and the EP violation signal for our pur¬ 
pose). 

The second term accounts for possible perturbations, 
whose amplitudes a.i should also be estimated to reject 
any bias. 

The third term is the residual noise vector z assumed 
to be a zero-mean Gaussian random vector. The main 
objective is the estimation of 5, for which the square root 
of the one-sided noise power spectral density (PSD) at 
EP frequency must be 1.4 x 10 ~ 12 ms“ 2 /-\/Hz pQ. 

The presence of missing or corrupted data in the time 
series is identified by a mask vector w which is equal to 
1 when the data is available and 0 otherwise, regardless 
of the nature of the gap. The observed signal is thus the 
vector y with entries y n = w n "f n . We assume that the 
loss of data arises before any possible filtering. 
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A. Impact on the PSD 

We briefly derive the impact of the observation window 
w on the PSD of a pure stationary random signal. Thus 
in this section we assume 7 = z. The real signal 7 is 
regularly sampled at a frequency f s so that 7 n = ”f{n/ f s ). 

For a stationary discrete parameter process, the auto¬ 
covariance function is defined as: 

Ry(k) = ^[VnUn+k] ~ E'[?/ Jl ]E[t/ rl _)_fc]. (2) 

Then the PSD is the Discrete-Time Fourier Transform 
(DTFT) of the autocovariance [2lj : 

1 +00 

S v (f) = T E Ry(k)e~ 2 ^ kf/f °. (3) 

Js , 

K= — OO 

In the case of the masked noise y n = w n z n , Eq. ([ 2 ]) gives: 

R y (k) = E [w n z n w n+k Zn+k] - E[w n z n ]F,[w n+k z n+ k}. 

We assume that the underlying process in 2 is indepen¬ 
dent of the window w, and that w is a stationary process, 
so that one can write: 

Ry{k) E[ 2 rl 2 y l -{-fc]E[xc n '(c n -|_/ l .] y, z y w 

— (i?z(fc) + y-z) (R"w{k) + y ul ) — (4) 

where for any random variables x we note y x its expec¬ 
tation. 

Assuming that z n is a zero-mean process (y z = 0), the 
PSD of the windowed signal is obtained by taking the 
DFT of Eq. @: 

S y (f) = y 2 w S z (f) + [S w *S z ](f), (5) 

where * is the convolution operator. 

The first term can be viewed as a loss of power due to 
the missing data and the second term accounts for the 
frequency leakage. In the case of uniform random gaps, 
one shows (see appendix[A]) that y, w is equal to the proba¬ 
bility to have a gap at a given time. Then S w (f) is a con¬ 
stant, and the leakage term is proportional to the mean 
power. Therefore, the noise will increase significantly at 
frequencies where the leakage term is dominant. 

As an illustration, a simulation of the MICROSCOPE 
instrumental noise alone is presented in Fig. |T] The noise 
is generated using an approximate PSD model, taking 
into account thermal sensitivities at lower frequencies, 
position sensor noise at higher frequencies, random noise 
of the pick-up circuitry and the frequency response of the 
control loop: 

2 S z (f) = a 2 z + (£) + (£) ^ • \H cl (f)\ 2 (6) 

with a z = 1.4 x 10~ 13 ms" 2 /VHz, fi = 8.1 x 10~ 2 Hz 
and /2 = 1.3 x 10 -2 Hz. H c i is the transfer function 


of the closed control loop of the accelerometer. It has 
almost a unit gain for all frequencies under 1 Hz, and 
induces a slight inflection in higher frequencies. The fac¬ 
tor of 2 accounts for the fact that S z (f) is the two-sided 
PSD. The data is sampled at a frequency f s = 4 Hz 
on a duration T = 1.4 days corresponding to 20 satel¬ 
lite orbits with N g = 5200 gaps of the same length 
(0.5 seconds), randomly distributed over the time series. 
We observe a transfer of power from high frequencies 
to low frequencies, increasing the apparent noise around 
/ep = 9.4 x 10 ~ 4 Hz by two orders of magnitude. 



Frequency [Hz] 


FIG. 1. Periodogram of original (black) and incomplete (grey) 
time series with 0.5 second data gaps randomly distributed in 
a 20 orbits session. The simulation is done for 260 random 
gaps per orbit. 


B. Impact on the least squares estimate 

We now demonstrate that the observed increase of the 
noise is not a simple artifact of the Fourier representa¬ 
tion but directly impacts the estimation uncertainty in a 
least squares fitting approach. We assume that the ana¬ 
lyzed signal is the sum of a harmonic component Sep at 
frequency /ep and a correlated Gaussian random noise 
2 . For the sake of simplicity, we ignore the presence of 
possible deterministic perturbations, therefore the signals 
Spy’s in Eq. ([!]) are all zero. The signal is still sampled 
at frequency f s on N data points. Thus the signal reads: 

7 = <5s E p + 2 - (7) 

We define the window matrix as the diago¬ 
nal matrix formed by the window vector: W = 
diag (wq ... wn- 1 ). We aim at calculating the variance 
of the OLS estimate that only uses the available data (at 
times for which w n = 1). In the least squares formal¬ 
ism, this is equivalent to studying the windowed vector 
y = W'f. We also define the model matrix A. Although 
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it can take a general form including various signals, we 
assume here that it contains the EP signal model only 
such that A = s EP . We also define the masked model 
matrix A w = WA. The usual OLS formulas give the 


following parameter estimate: 


5 = (A w ^ A u ,) • A w ^ y , 

(8) 

as well as its variance: 


Var (5) = K~ 1 A W ^T, V A W K~ 1 , 

(9) 


where we defined K = A w 'A w and E y = WE Z W^ with 
Yi z = E[zz’l], the covariance matrix of the noise vector. 
Here f denotes the hermitian adjoint. As a result, the 
noise correlation seen by the estimator is E v instead of 
E„ in the complete case. 

In the case of a stationary Gaussian random noise the 
estimator covariance can be diagonalized in the Fourier 
space: 


E y = ^M^DM, (10) 

where D is the diagonal matrix formed by the two-sided 

A A r p 

discrete PSD: D = diag (S 0 • ■ • Sn- i) and M is the 
Discrete Fourier Transform (DFT) matrix with coeffi¬ 
cients: Mki = exp (— The discrete spectrum is 

defined as the expectation of the periodogram. It can be 
seen as an approximation of the real PSD [21]: 

Sk=y E (l-^)Ry(n)e~ 2 ^. (11) 

■'* n=—(N—l) ' ' 


This diagonalization thus links the estimator vari¬ 
ance and the PSD of the windowed noise. By devel¬ 
oping Eq. © we show (see appendix [B] for more de¬ 
tails) that in the case of a harmonic model such as 
sep.ii = 7ep sin(27m/ E p// s ), for sufficiently large N, the 
estimator variance is approximately equal to: 


Var(<5) 


2f s NS y (fEp) 

-^o7ep 


( 12 ) 


where S y is given by equation Q, N 0 = N — N g is the 
number of observed data and y E p is the amplitude of 
the model, which is the gravitational acceleration in our 
case. As a result, in presence of missing data, the esti¬ 
mation variance increases proportionally to the leakage 
term in Eq. ©• To quantify the increase of the uncer¬ 
tainty, we plot the standard deviation of the estimator as 
a function of the number of data gaps per orbit in Fig. 
[2j in the case of short random gaps of fixed length (0.5 
second) uniformly distributed over the time series (the 
effect of the size and the number of gaps is discussed in 
Berge et al, in prep.). The theoretical standard devia¬ 
tion (black curve) is obtained using Eq. ©• In order to 
check the correctness of the distribution, we also plot the 
sample standard deviation of 400 estimates (red curve) 


corresponding to different realizations of the noise vec¬ 
tor z. This shows that the uncertainty grows by one 
order of magnitude from 10 gaps per orbit only, which 
represents a data loss of 0.04 %. This is not acceptable 
with respect to the performance objectives of the mis¬ 
sion. Therefore an alternative estimation method needs 
to be implemented. 



Number of gaps per orbit 


FIG. 2. Theoretical (black) and sample (red) standard de¬ 
viations of the original least squares estimate of the EP pa¬ 
rameter as a function of the number of gaps per orbits. All 
gaps have the same duration of 0.5 second and are randomly 
distributed over a 20 orbits session. 


III. KALMAN-AR MODEL ANALYSIS 
(KARMA) 

The poor performance of the OLS estimator is due to 
the fact that its variance is not minimal. To minimize the 
variance, the Best Linear Unbiased Estimator (BLUE) 
is needed, which takes the form of a Generalized Least 
Squares (GLS) estimator in linear regression problems. 
In case of missing data, it reads: 

/3 = (AtEu'A)- 1 • AjE 0 ~ l y 0 , (13) 

where (3 is the q x 1 vector of parameters to be estimated. 
The observation vector y a = (j no ... 7 njVo - 1 ) gathers 
the available data only, that is, no,... ,n,N 0 -i are the 
time indexes corresponding to the observed data. Sim¬ 
ilarly, A a is the model matrix where we have kept only 
rows corresponding to observed data. Here A 0 is assumed 
to be general, of size N x q. E 0 is the covariance matrix 
of the observed noise vector z a and admits a Cholesky 
decomposition such that E 0 = L 0 Lj where L a is a lower 
triangular matrix. 

The difficulty here is to estimate the noise covariance 
matrix E 0 in spite of the missing data. The method 
that we propose consists in calculating an approximation 
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of the GLS estimator by postulating an autoregressive 
(AR) model for the noise. This is done in three steps 
which are detailed below: estimation of AR parameters 
(step 1), calculation of the whitened vectors L 0 ~ 1 y 0 and 
L 0 ~ l A 0 (step 2), calculation of the estimate 0 (step 3). 
The process may be iterated if necessary. 


A. Step 1: AR parameters estimation 


The first step is to estimate the noise characteristics 
encapsulated in the covariance matrix £ 2 . To do so, we 
assume that the noise process can be described by an 
autoregressive model of some order p to be determined, 
verifying the following relation at all times n: 


~n + O’lZn-l + ••• + UpZn-p — f-m (14) 

where a \,..., a p are the AR coefficients and e is a zero- 
mean white Gaussian random field of variance a 2 . Note 
that this is equivalent to approximating the noise PSD 
with a rational function, the numerator being a polyno¬ 
mial of degree p in exp(— 2inf / f s ) such as: 


&(/) = 


V 2 /fs 


1 + aie- 2 ”D/A + ... + cLpe-^pf/fs 


(15) 


The choice of this model is motivated by the follow¬ 
ing arguments. Firstly the use of a parametric model 
consistently reduces the number of noise parameters to 
estimate (p instead of N), and therefrom the computa¬ 
tional cost. Secondly choosing an AR model rather than 
a more general class such as autoregressive-moving av¬ 
erage models (AR.MA) allows us to easily estimate the 
parameters from the discontinuous data, while ARMA 
models usually involve computationally expensive opti¬ 
mization procedures, or direct estimation of the autoco¬ 
variance function which is not accurate when data are 
missing. Furthermore any moving-average model can be 
approximated by a high order AR model as discussed by 
Durbin (22] . 

The AR parameters 6 = ( ai,a 2 ) are estimated thanks 
to Burg’s algorithm adapted to the missing data case PH- 
This technique relies on the minimization of forward and 
backward residuals of the model (141 through a recursive 


procedure that increases the order k of the AR model 
at each step, until k reaches p. This algorithm takes 
advantage of all segments of available data. For a given 
order k , only the segments of size N s > k can be used for 
the estimation. Note that the proper AR order must be 
previously determined according to some criteria such as 
Akaike’s, as discussed later. 

For the first iteration, the AR estimation is performed 
on the residuals of the OLS estimation z a = y a — A 0 (3 ols 
instead of y 0l where /3ols is the result of the simple 
estimate given by Eq. This reduces the disturbance 
of deterministic components onto the estimation of the 
noise parameters. 


B. Step 2: computation of the weighted vectors 
with the Kalman filter 


The determination of the AR parameters gives access 
to the noise autocovariance function. The aim of this 
step is to use this result to calculate the weighted obser¬ 
vation vector L~ 1 y 0 and weighted model matrix L~ 1 A a 
involved in the expression of the estimator (13). The ma¬ 
trix L 0 indirectly depends on the AR parameters via the 
autocovariance function, since: 


S„(m, l) = R z (| n m - rn\) \/(m,l) G [0, N a - l] 2 , (16) 


where the autocovariance function R z is estimated by 
taking the inverse Fourier transform of Eq. (151. 

Unlike the case of complete stationary random series, 
the observed data in a missing pattern do not have a cir- 
culant nor Toeplitz correlation matrix, because the nt s 
are not regularly arranged. Therefore the matrix £ 0 can¬ 
not be inverted by efficient techniques such as Levinson 
or FFT algorithms. If the data sample is large (like in 
the MICROSCOPE case where typically N ~ 10 6 ), this 
creates memory difficulties to store such a matrix. That 
is why we present a way of avoiding the direct inversion 
using a Kalman algorithm to compute the weighted data. 

The relationship between GLS and Kalman filtering is 
explained as follows. Following the notation of Gomez 
and Maravall [.2Uj, an AR process can be described by 
the state-space representation: 


x(n) = Fx{n — 1) + Ge(ro), (17) 

z n = H T x{n). (18) 

The above equations are the state equation and the ob¬ 
servation equation of the Kalman Filter. x(n) is the state 
vector at time n, defined by: 

x(jl ) = ( Z n Z n -j- i| n . . . , 

where z n +fc \ n is the conditional expectation of z n +k given 
the observations before time n. H is the matrix linking 
the state vector to the observations, and simply reads 
H = (l 0 ... 0) . The model matrix F and the model 
noise vector G are calculated from the AR parameters 
and are defined in appendix [C] 

The Kalman filter aims at calculating the a priori es¬ 
timate of the state vector along with its variance at each 
time n given all the observations until time n — 1, that 
is: 


Zn\n-1 = E[z n |2:o, Z u . . . , 2 n -l], (19) 

®n\n— i = Var[z n |zo, Z 1 1 • ' • j z n— i]* (20) 

We define the normalized innovation vector e whose 
elements are calculated with the Kalman residuals and 
their standard errors: 


— (z n Z n j n —i ) /&n\n—l' 


( 21 ) 


Since z n |„_i is actually the projection of z n onto the sub¬ 
space generated by (zq • ■ • Zn- 1 ), Eq. (211 is equivalent 
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to a Gram-Schmidt orthogonalization procedure. As a 
result, the e n ’s are uncorrelated. In addition, z n i„_i is 
a linear combination of Zi, i < n, thus the normalized 
innovation vector can be expressed as: 


e = Tz, 


( 22 ) 


where T is a lower triangular matrix with diagonal ele¬ 
ments equal to one. If we calculate the autocovariance of 
Eq. (22) we find that 

Cov[e] = TE z T t => E z = (TT t ) _1 , (23) 


where the implication is based on the fact that Cov[e] is 
equal to the identity matrix. This last equation shows 
that the matrix T is equal or proportional to the inverse 
of the Cholesky decomposition L -1 of the covariance ma¬ 
trix E~. However, in our problem this is not exactly true. 
The derived equalities are only valid if the random data 
truly follows the AR process, which is not the case in our 
approach since the AR model is just an approximation 
of the real underlying random process. We thus assume 
that the Kalman output e is only approximately equal 
to L~ l z. 

If data are missing, the classic Kalman procedure must 
be slightly modified to properly deal with missing data, 
as explained by Jones he but the components of the 
normalized innovation vector e corresponding to missing 
data are ignored in the estimation at step 3. 


give a confidence threshold for the detection of the signal 
of interest. To achieve this goal, the estimator variance 
matrix must be estimated. 

The correlation matrix can be approximated under the 
assumption that the AR model is a good approximation 
of the real noise correlations. This hypothesis is equiva¬ 
lent to assuming that the estimator has minimal variance 
( i.e . that the estimator is the BLUE). Let C be the co- 
variance matrix of the estimator (3. Then Eq. ^ gives, 
by replacing IT by T a and A by A a : 

Cttal(EjE o y\ (25) 

where <jq accounts for the fact that the covariance is 
known up to a proportionality constant. For an unbi¬ 
ased estimator (i.e. the model matrix A a describes all 
the deterministic components of the signal) this can be 
estimated by: 


e\,e z 

N^q 1 


(26) 


where e z is the vector of weighted residuals defined by 
e z = e Q — Ed3. The statistic to be considered is: 


Zk = 



(27) 


C. Step 3: computation of the GLS estimate 


In the previous paragraph we showed how to perform a 
quasi orthogonalization of the observation vector, which 
is exactly what is needed to compute an approximate 
version of the Generalized Least Squares (GLS) estimate. 

The estimator in Eq. (13) can be rewritten: 


(3 = (E 0 'E 0 ) 


-1 


Ejt 


(24) 


where, with obvious notation, we denote the normalized 
innovation vectors e D = T 0 y 0 and E a = T 0 A 0l calcu¬ 
lated with the outputs of the Kalman filter algorithm, 
respectively applied to the observed signal and to each 
columns of the model matrix. Both vectors are obtained 
by keeping elements corresponding to observed data only. 
The Kalman algorithm is thus used here as a device to 
compute the weighted vectors involved in the GLS. 


where k is the index corresponding to the parameter of 
interest in the vector /3. For our application 0%. is the 
EP parameter S. Here we assume that the underlying 
process is Gaussian, which is reasonable in the case of 
the MICROSCOPE instrumental noise. Then under the 
assumption that there is no violation signal (hypothesis 
Hq), Z approximately follows a Normal law with mean 
zero and unit variance. A detection threshold with a 
(1 — a)-confidence level is given by imposing that the 
probability to observe a value above the threshold, un¬ 
der Hq, must be lower than a. This gives the threshold 
2 = 4*" 1 (l — ^), where $ is the Normal Cumulative Dis¬ 
tribution Function (CDF). Therefore if \Z\ is above the 
threshold, then a signal is detected with a confidence of 
100(1 — a)%. Conversely, for a given estimation of the EP 
signal, the violation can be claimed at a 100(2 &(Z) —1)% 
confidence level. Typically, a 99% confidence test re¬ 
quires z = 2.56. 


IV. THEORETICAL UNCERTAINTY AND 
DETECTION ISSUES 

This is of key interest to be able to assess the statis¬ 
tical uncertainty of a given estimation, especially in a 
context where the experiment cannot be reproduced a 
large number of times. In this section we present a tool 
to quantify the uncertainty of the regression result and to 


V. APPLICATION TO SIMULATED DATA OF 
THE MICROSCOPE MISSION 

A. The MICROSCOPE experiment 

The Weak Equivalence Principle (WEP) is at the basis 
of General Relativity. Its concrete manifestation is the 
Universality of Free Fall, stating that a body in a gravita¬ 
tional potential is accelerated independently of its mass 
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and internal composition. Current efforts to build new 
unification theories may call this principle into question 
[23], postulating the existence of additional fundamental 
interactions. To provide an experimental discrimination 
of these theories, the goal of the MICROSCOPE space 
mission is to test the WEP within a precision of about 
10~ 15 never reached by previous ground-based experi¬ 
ments m E5j- This space-borne test takes advantage 
of the duration of the fall by integrating the data over 
several orbits. 

The mission payload is an ensemble of two electrostatic 
differential accelerometers composed by a cage contain¬ 
ing two cylindrical and co-axial test-masses (TM). One 
accelerometer is devoted to the EP test, while the other 
serves as a reference. In the first accelerometer, the two 
TM have different compositions: one is made of Platinum 
Rhodium alloy (PtRh) and the other of Titanium alloy 
(TA 6 V) [22]. In the second accelerometer the TM are 
both made with PtRh. The masses, whose potential is 
kept constant via a thin gold wire, are servo-controlled by 
a set of electrodes to follow the same trajectory. The MI¬ 
CROSCOPE science signal is the difference between the 
accelerations applied to the two TM, which are deduced 
from the applied electrostatic forces needed to maintain 
them relatively motionless at the center of the cage. A 
drag-free system ensures that the measured common ac¬ 
celeration, i.e. the mean acceleration of the two TM, 
is nullified. A violation of the WEP would result in a 
difference between the two measured accelerations. 

The violation signal is expected to be periodic with 
a frequency /ep because of the projection of the gravi¬ 
tational acceleration onto the science axis of the instru¬ 
ment during the orbital trajectory. For a satellite inertial 
pointing session, /ep is equal to the orbital frequency. 
For a slowly rotating satellite in the orbital plane, this is 
equal to the sum of the orbital frequency and the satellite 
spin frequency. The duration of each session is chosen in 
order to reach a standard deviation error of about 10 -15 
on the EP parameter S, which is almost equal to the 
Eotvos parameter. The inertial and spin sessions last re¬ 
spectively 120 and 20 orbits. The specificity of the data 
samples to be analyzed in the MICROSCOPE mission is 
that the signal of interest has a low signal-to-noise ratio 
(SNR) that lies at low frequencies (10 ~ 4 - 10 ~ 3 Hz) in a 
time series with a broad frequency range (10 ~ 5 - 2 Hz), 
blurred by a colored noise containing most of its power 
in higher frequencies (above 10 -1 Hz). In addition, long 
time series must be analyzed to achieve a sufficient SNR, 
including about 5 x 10 5 data points for a spin session. 


B. Considered data sets 

We apply the KARMA method to a time series sim¬ 
ulated with a mission simulator. The simulation output 
is the differential acceleration vector 7 equal to the ac¬ 
celeration difference between the two masses. This time 
series is sampled at f s = 4 Hz and lasts 20 orbits. This 


corresponds to a spin session, for which the orbital fre¬ 
quency is equal to 1.7 x 10 -4 Hz and the spin frequency 
is 7.7 x ICC 4 Hz. The EP frequency is then equal to the 
sum /ep = 9.4 x 10 ~ 4 Hz. 

In addition to the signal of interest, other perturba¬ 
tions are present in the measurement as indicated in Eq. 
0. They are mainly due to gradient terms between the 
center of mass of the two TM, the relative motion of the 
TM, and coupling with the common mode because of 
instrument defects. During the experiment, the instru¬ 
ment or the satellite undergoes excitations that favor the 
SNR to measure their amplitudes The corresponding 
accelerations s p ^ are either modeled or measured, such 
that the perturbations can be subtracted from Eq. 0. 

Nevertheless, in this simulation we allow for the pres¬ 
ence of gradient perturbations. They come from the 
slight off-centering of the test-masses, leading to gravity 
and inertia gradient terms. In the simulation we assume 
that the TM are off-centered by 20 microns along the x 
and 2 -axis which are in the orbital plane. Note that al¬ 
though an off-centering along the y-axis can also exist, 
it is estimated by means of dedicated calibration sessions 
and corrected numerically before the EP estimation. The 
EP parameter is simulated at a level of 3 x ICC 15 . Thus 
the regression model A contains the true acceleration sig¬ 
nal g x (t ), to which we add the two perturbations modeled 
with our knowledge of gravity and inertia gradients. The 
noise added to the data is generated from the PSD model 
given by Eq. 0. The signal model reads: 

7 (t) = Sg x (t) + A X^XX (t) + A z T xz (t) + z(t), (28) 

where we have noted g x the gravitational acceleration 
projected onto the x-axis, Tij the components of the gra¬ 
dient tensor, and the off-centerings. Thus in this case 
there are three regression parameters: <5, A x and A z . 

We consider two types of gap pattern. The first one 
is a “tank crackle type” window w a that is generated so 
that all gaps are of equal duration (0.5 second) and their 
positions are randomly distributed on the sample (uni¬ 
form distribution with 260 gaps per orbit). The second 
one is a “telemetry losses type” window wt where the 
gaps durations are drawn from a distribution similar to 
the telemetry thread of the PICARD mission m, with 
a standard duration of one minute. Their positions are 
distributed in the same way as for the first window. Each 
window represents the same fraction of missing data, of 
about 2%. Thus window w a comprises more gaps than 
window Wb (larger N g ) but gaps are shorter in average. 
To illustrate this, we plot in Fig. [3] an extract of the 
time series where the data interruptions of each window 
are identified by vertical grey bars. 

We apply the KARMA method and compare the result 
to the Ordinary Least Squares estimate with missing data 
to assess the improvement. We also compare the result 
to the reference given by the OLS estimator in the case 
without gaps. 



— Signal — Interruptions 
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FIG. 3. Fraction of the temporal series (black) with the in¬ 
terruption times represented by the grey vertical lines for the 
two windows w a (top) and Wb (bottom). 

C. PSD estimate from the AR fit 

Before starting the whole process, the order of the 
AR model must be chosen at step 1. The choice of 
the order depends on the PSD of the noise affecting 
the measurement, and on the observation window. A 
way to properly choose the order p is to minimize the 
Akaikc’s Information Criterion [28] defined as AIC(p) = 
2p — 2 log (L max (p)), where L max is the maximized log- 
likelihood. In the case of an AR model, this can be ex¬ 
pressed in terms of the estimate of the AR residual vari¬ 
ance <j 2 which is directly computable from the residuals 
of the Burg’s algorithm: 

AIC(p) = 2p + N 0 log(<j 2 ). (29) 

Applying Burg’s algorithm to the residual series z defined 
in section [ill A| with increasing order k allows us to find 
the order that minimizes the AIC. 

In the MICROSCOPE case, AIC(p) is an asymptoti¬ 
cally monotonic decreasing function. In this configura¬ 
tion one possibility is to choose the order p from which 
there is no significant improvement in the AIC, i.e. the 
minimum order where the AIC is close enough to the 
asymptote. For the studied noise, the AIC typically 
reaches a plateau from p = 200. 

Nevertheless, in case of very frequent missing data (e.g. 
tank crackles), the variance of the AR coefficients esti¬ 
mates increases with the order, and so does the variance 
of the AIC. This is due to the decrease of the number of 
usable data segments (with length larger than p). This 


can lead to overestimating the optimal order p. To over¬ 
come this difficulty we can modify the AIC criterion as 
suggested by Bos et al. [29] by introducing a penalty 
accounting for the increasing estimation variance. We 

choose to replace N a by p w) where is the 

number of usable segments to estimate the coefficient cq. 
When applying this criterion to our simulation with win¬ 
dow w a , we find an optimal order of p = 60. 

The process converges after 2 iterations, because the 
first estimate of the PSD is influenced by the high ampli¬ 
tude perturbations of the gradient terms: the main peak 
has an amplitude of 2.4 x 10 -11 ms -2 at 2/ep, and other 
peaks are present at f or b and 2f or b- In comparison, the 

EP violation corresponds to an amplitude of 1.2 x 10 -14 

_2 

ms . 

We plot in Fig. [4] the estimate of the PSD (red curve) 
obtained with the AR coefficients calculated by Burg’s 
algorithm with the tank crackles (a) and telemetry (b) 
windows, along with the real PSD (black curve). The 
level of noise of the masked data is shown by the black 
dotted line. In addition to the selected order p = 60, we 
also show the AR spectrum estimate made with a larger 
order ( p = 200 with window w a , and p = 2000 with 
window wj,) to illustrate the effect of p. In both cases, 
we see that the overall shape of the PSD is well described 
by the AR model, especially the / 4 slope. However, there 
is a bias which increases as the frequency decreases. The 
reasons why the AR model cannot accurately describe 
the low frequency PSD are two-fold: 

1. The order of the AR model is finite, and limited by 
the longest segment of consecutive available data 
(this is typically 700 for the window w a , and 50000 
for Wb). Given that AR models cannot describe 
1// spectra with a finite number of parameters, a 
larger order is necessary to reduce the bias (and the 
bias is zero when p tends to infinity). 

2. In the Burg estimation procedure, the larger the 
AR order, the larger the variance of the AR coeffi¬ 
cients estimates, because there are fewer segments 
of corresponding lengths. This is why we do not 
choose the highest possible order, for which seg¬ 
ments of corresponding length are rare. 

Since window Wb has more spaced and longer gaps than 
window w a , it allows for a higher possible AR order lead¬ 
ing to a better restitution of the low frequency shape of 
the PSD, with a reasonable variance (see Fig. [4]). How¬ 
ever we choose p = 60 even in the case of window Wb 
for computational reasons, given that this is the high fre¬ 
quency restitution that matters for a parameter regres¬ 
sion purpose, as we shall see in the next paragraph. 

D. Regression results 

The results of the linear regression are summarized in 
Table [T] with p = 60. In order to test the precision of our 
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(a) Tank crackles gaps 


(b) Telemetry losses 


FIG. 4. PSD estimates of the noise in presence of missing data. The black dashed curve is an estimate of the masked data PSD 
[obtained using Eq. i>] , the black solid curve is the actual noise PSD, and the red and blue curves are the PSD estimates of 
the AR model obtained with Burg’s algorithm. The periodograms of the regression residuals are also plotted for the complete 
(dark grey) and masked (light grey) cases. 


method, we have drawn 400 realizations of the noise and 
run our estimation algorithm for each of them, as well as 
the OLS estimator. The number of draws is chosen such 
that the error on the true value of the standard deviation 
of the EP parameter does not exceed 10~ 16 with a 99% 
confidence. 

The third column of the table indicates the true value 
of the parameters. Columns 4 to 6 show the performance 
of the OLS estimator: the sample average p, the theoret¬ 
ical standard deviation ath given by Eq. ([9]) and calcu¬ 
lated with the real PSD, and the sample standard devi¬ 
ation of the 400 estimates. The last three columns show 
the results obtained with the KARMA method, and are 
detailed below. 

The sample mean p of the estimates obtained with 
the KARMA method converges to the true value of the 
parameters (seventh column of table [T|, showing that the 
constructed estimator is unbiased. 

We also calculate the sample standard deviation of the 
EP parameter. For short and numerous gaps (tank crack¬ 
les window) we find a = 1.1 x 10~ 15 with our method 
instead of 6.5 x 10 -14 with the OLS estimator. Thus 
our method enables us to divide the stochastic error by 
a factor 60 with respect to the OLS. 

For fewer and longer gaps (telemetry window), we find 
a = 9.8 x 10~ 16 instead of 5.1 x 10 -15 with the OLS. 
We notice that such a gap pattern has less impact on 
the estimation performance, because it leads to a lower 
frequency leakage as confirmed by Eqs. (A31 and (A4) 
of appendix |A| (also see Berge et al ., in prep.). 

These are satisfying results since the theoretical uncer¬ 
tainty of the OLS without any missing data is equal to 
9.6 x 10 -16 . The detection test is positive with a confi¬ 
dence greater than 99% in both cases. 


The improvement is also significant for the other pa¬ 
rameters. Even if they are already well estimated by the 
OLS, their uncertainty is reduced by almost two orders 
of magnitude for the tank crackles window. 

For each draw, we estimate the uncertainty < tar us¬ 
ing the approximate formula (25). We then calculate 
the sample average of this estimate over the 400 draws, 
and record the results in the table. We find 1.2 x 10 -15 
for window w a and 9.3 x 10~ 16 for window Wb- This is 
close to the calculated sample standard deviation, mean¬ 
ing that when having only one realization at hand, one 
can estimate the error with an acceptable accuracy. The 
estimated error does not vary much from one estimation 
to another, and stays within an interval of ±10~ 16 around 
the mean. 

The estimate &ar of the real regression error may be 
biased, depending on the frequency of the estimated sig¬ 
nal. This can be explained by Fig. |4j where we observe 
that the PSD of the AR model is biased at low frequency. 
As a result, the lower the signal frequency, the larger the 
bias on the estimated variance. This is particularly true 
around zero, where the AR PSD is below the real one. 
However, the overall shape of the real PSD is well cap¬ 
tured by the AR model, which is enough to cancel the 
leakage due to the window and get a precision of 1 x 10“ 15 
for the EP estimation, in agreement with the mission re¬ 
quirement. 


VI. CONCLUSION AND DISCUSSION 

We have shown that the presence of gaps in time se¬ 
ries affected by correlated noise has a strong impact on 
the classical Fourier analysis and on the precision of the 
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TABLE I. Mean and standard deviations on the estimation of the parameters of interest using OLS and the KARMA method. 
In both cases we present (from left to right) the estimation average calculated on a sample of 400 estimates, the analytical 
standard deviation, and the sample standard deviation. For OLS, the analytical uncertainty a t h is given by Eq. (|9j) , which is 
exact. For the KARMA method, &ar is the average of the uncertainties estimated for each draw with Eq. (|25|). 



Ordinary Least Squares 

Kalman-AR Model Analysis 

Window 

Par am. 

True 

A 

&th 

a 

A 

O AR 

<7 


5 [10 -ia ] 

3 

3.01 

0.96 

1.02 

2.98 

0.92 

0.96 

Complete data 

A x [pm] 

20 

20.0 

0.003 

0.005 

20.0 

0.004 

0.003 


A 2 [pm] 

20 

20.0 

0.003 

0.005 

20.0 

0.004 

0.003 


<5 [lCT 15 ] 

3 

8.82 

62.3 

65.2 

2.98 

1.19 

1.14 

Tank crackles 

A* [pm] 

20 

20.0 

0.290 

0.296 

20.0 

0.006 

0.004 


A z [pm] 

20 

20.0 

0.292 

0.314 

20.0 

0.006 

0.005 


5 [lCT 15 ] 

3 

3.15 

5.20 

5.07 

2.98 

0.93 

0.98 

Telemetry 

A x [pm] 

20 

20.0 

0.021 

0.021 

20.0 

0.004 

0.003 


A z [pm] 

20 

20.0 

0.024 

0.024 

20.0 

0.004 

0.003 


ordinary least squares fits of harmonic signals. This is 
due to the frequency leakage of the noise power, which 
can increase the uncertainty of the fit by several orders 
of magnitude, even if the percentage of missing data is 
small. 

We proposed a method that we dubbed “KARMA”, 
which provides a general way to perform precise linear 
regressions with large and incomplete data sets affected 
by unknown colored noise, and that we applied to mock 
MICROSCOPE data. The estimation variance is de¬ 
creased down to the same order of magnitude as the least 
squares estimator with full data, altered by the natural 
loss of signal due to the 1/y/N dependence. The method 
tends to approach the minimum variance estimator of 
the available data, by approximating the noise autoco¬ 
variance with a high order AR model. 

Our method uses a weighting of the data relying on the 
estimation of the shape of the PSD. As a result, the per¬ 
formance of the regression mainly depends on the ability 
of the autoregressive PSD estimate to recover the part of 
the spectrum that is responsible for the leakage, which 
is the high frequency part increasing as / 4 in the MI¬ 
CROSCOPE case. Although this is not shown here, the 
method has also been successfully tested in a case where 
the leaking power comes from a thermal l// 2 noise pro¬ 
jected onto high frequencies. The AR PSD then accu¬ 
rately fits the low frequency slope and allows us to im¬ 
prove the possible regression of high frequency compo¬ 
nents. 

In addition, the outputs allow us to evaluate the vari¬ 
ance of the estimator from a single estimation. We re¬ 
cover the magnitude of the true precision, equal to 10 -15 
in our MICROSCOPE illustration. The variance is not 
estimated with a better accuracy because of the low fre¬ 
quency bias of the AR PSD estimator. This bias depends 
on the missing data pattern, and more particularly on the 
length of the longest uninterrupted data segment, as well 
as the number of long segments. This determines the AR 
order to be chosen, resulting in a trade-off between the 


bias and the variance of the PSD estimate. 

Concerning the scientific objective of the MICRO¬ 
SCOPE mission, the above discussion demonstrates that 
based on the current noise model of the accelerometers, 
we will be able to get a 99% (resp. 68%)-confidence level 
detection of a 3 x 10 -15 (resp. 1 x 10 -15 ) EP violation 
signal, even in the presence of missing data, for a 20 
orbit-measurement session (completed in 1.4 days). The 
mission should include more than 70 sessions of this type, 
allowing for a detection at the 99% level even for an am¬ 
plitude of 1 x 10 -15 . This has been done for short and 
very frequent gaps to represent acceleration peaks or sat¬ 
urations due to MLI or tank crackles, as well as for longer 
and fewer gaps to simulate telemetry interruptions. 

Further developments will concentrate on how to in¬ 
crease the accuracy of the noise PSD estimate, for exam¬ 
ple by using the AR model to perform missing data im¬ 
putation. Indeed, although this is computationally more 
expensive, the AR model can be exploited in a Gaussian 
process regression approach [317 to estimate the missing 
values. 

Finally, there are two potential limitations to the pre¬ 
sented method that can be addressed in further exten¬ 
sions. On the one hand, although the AR model can 
be a good approximation to any PSD and can be fitted 
very efficiently, it is still a parametric an thus restrictive 
model. On the other hand, noise and signal parame¬ 
ters are estimated iteratively but separately, so that each 
step is done conditionally to the previous one. This may 
result in a loss of accuracy. As a result, a possible gen¬ 
eralization is to use the proposed method as an efficient 
initialization procedure for a more general regression al¬ 
gorithm that would maximize the full likelihood without 
any prior noise model. 
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Appendix A: PSD deformation in the case of 
random missing data patterns 


We derive here the PSD of the masked data in the case 
where the gaps positions in the time series are drawn 
from a uniform distribution. Let N be the length of 
the time series, N g the number of gaps, and the in¬ 
dices indicating the location of the beginning of each gap 
(such that w nb f = 0). Each gap ends at the location 
rib'i + dni (we adopt the convention w nb = 1)- By 
uniformly distributed, we mean that rib is a random vari¬ 
able following a discrete uniform distribution on the in¬ 
terval [0 ,N — 1]. We also allow the gap duration dn 
to be randomly distributed. The window vector is then 
generated by drawing N g realizations of rib and dn. 

The probability P to observe a data at a time n is 
calculated as follows: 


P = P (w n = 1) 

N g -1 

= ]^[ P(n < n bi i or n> nb,i + dni) 


i—0 


= [1 — P (nb < n) + P (nb + dn < n)} Ng . (Al) 
The cumulative probability function of nb is given by: 

n + 1 


P {n b <n) = 


N 


(A2) 


In the case where the duration of the gaps is fixed ( i.e. 
dni = dno Vi), Eq. (Al) gives: 


P (w n = 1) = 


1- 


n + 1 n — dnn + 1 


lN a 


N 


N 


N — dno 


->N„ 


N 


(A3) 


Therefore the probability law of w n is a Bernoulli’s law of 
parameter P. Its expectation is fi w = P and its variance 
is ct 2 = P(1 — P). We notice that P is independent 
of time, and the autocovariance function of w is simply 
R w (n) = cr^<5(n) where 6(n) is the delta Dirac function. 
Then we use Eq. ([5]) to calculate the PSD of the masked 
data: 


/' 2 

S y (f) = P 2 • S z (f) + P( 1 - P) / S z (f')df. (A4) 

"'“2 


Appendix B: Derivation of a simplified equation for 
the OLS variance in the harmonic case 


covariance formula can be written as: 


Al y A 

Var (6) = ™ ™ 


(^AhjAy^j 


2 • 


As reminded in Eq. (10), the covariance matrix is di- 


agonalizablc in the Fourier space. We keep the same 
notations in the following. In addition, we use the fact 
that the Discrete Fourier Transform (DFT) operator is 
a Vandermonde matrix (since M = NI with / the 
identity matrix), therefore the variance can be rewritten 
in terms of the DFT of the windowed model A w , noted 
A w = MA W : 

4 t n A 

Var {6)=Nf ” “ 2 . 


By developing this expression we get: 

Var(<5) = I A wk \ 2 Nf s Sy k 

(Ef^MUl 2 ) 

For the windowed harmonic model A Wn = 
w n jEP sin(27rn/ E p//s + </> ep ), A w is the convolu¬ 
tion of the DFT of the window and the DFT of the 
EP signal. In the case of a random window, \A W \ 
usually peaks at the EP frequency with a value of 
7EpA r 0 /2 where N a is the number of observed data 
(where w n = 1). To simplify the calculations, we neglect 
the terms at all other frequencies. This amounts to 
ignoring the leakage of the harmonic signal (but note 
that the leakage of the noise component is present in 
S v through equation [5]). Furthermore, if we assume 
that the integration period is an integer multiple of the 
EP period {i.e. there exist an integer /cep such that 
/ep = kEpf s /N), then we have: 


Var (5) 


7eP 4 {Sy{fp,p) + Py(— /ep)) 


(2x7Ip?)' 


where we have made the approximation, valid for large 
N, that the DFT of the autocovariance function in Eq. 
(11) is equal to the real PSD. By simplifying we get equa¬ 
tion fT2l 


Var((5) 


2f s NSy{f EP ) 

-^oTep 


Note that in the case of a complete data set {w n = 1 Vn) 
we have N a = N and this formula is more accurate be¬ 
cause the model lA^j exactly peaks at 'y EP N/2. 


We derive here the approximate expression of the vari¬ 
ance of the ordinary least squares estimator used in Sec¬ 
tion [n| 

We start from Eq. In the case of a simple har¬ 
monic model, the matrix A w is a column matrix and the 


Appendix C: State space equation of an AR model 


We detail here the Kalman equations presented in Sec¬ 
tion UTlB] 
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The observation matrix, the model matrix and the 
model noise matrix are defined respectively by: 


H=( 1 , 0 , 

<% 


F = 


.0 y 

i 

0 


0 

\—a 


0 0 

p dp— 1 dp —2 


G = (1,9!,..., g p -{) . 


0 \ 
0 


1 

-aj 


can be formulated as follows: 

£(n|n) = w n |E(n|n — 1) — K {n)H T Yi(n\n — 1)} 

+(1 - w n ) {£(n|n - 1)} , 

x(n\n) = w n {x(n\n — 1) + K(n) ( z{n ) — H T x(n\n — 1))} 
+(1 - w n ) {x(n\n - 1)} , 

where we defined 

K(n) = E(n\n - 1 )H (.ff T £(n|n - l)ff) _1 . 


The vector G whose elements are defined by e n g 3 = 
z n +j-\\ n — z n+ j_i\ n _i can be calculated from the AR 
parameters (see Jones [Hj). 

The Kalman filter equations in presence of missing 
data are briefly reviewed here: 

Prediction equation 


Note that if the data is not observed at time n, the state 
variance and the state vector are not updated and set 
equal to the predicted values at previous time. 
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x(n\n — 1) = Fx{n — 1| n — 1), 

Y,(n\n - 1) = FT,(n - 1 |n - 1 )F T + Q, 

where Q = GG T . 

Update equation 

The update equation adapted to the missing data case 
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